24 research outputs found
Hidden Biases of End-to-End Driving Models
End-to-end driving systems have recently made rapid progress, in particular
on CARLA. Independently of their major contributions, these methods also
introduce changes to minor system components, so the source of improvements is unclear.
We identify two biases that recur in nearly all state-of-the-art methods and
are critical for the observed progress on CARLA: (1) lateral recovery via a
strong inductive bias towards target point following, and (2) longitudinal
averaging of multimodal waypoint predictions for slowing down. We investigate
the drawbacks of these biases and identify principled alternatives. By
incorporating our insights, we develop TF++, a simple end-to-end method that
ranks first on the Longest6 and LAV benchmarks, gaining 11 driving score over
the best prior work on Longest6.
Comment: Accepted at ICCV 2023. Camera-ready version.
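The second bias described above can be illustrated with a minimal sketch (all waypoint values below are made up for illustration): when a model's waypoint prediction averages over a multimodal distribution, e.g. a "go" mode and a "stop" mode, the averaged trajectory implies a lower speed than the "go" mode alone, which is why averaging acts as an implicit slowing-down mechanism.

```python
import numpy as np

# Two hypothetical waypoint modes: longitudinal positions ahead of the
# ego vehicle (meters) at 0.5 s intervals.
go = np.array([2.0, 4.0, 6.0, 8.0])    # constant ~4 m/s
stop = np.array([1.0, 1.5, 1.7, 1.8])  # decelerating toward a halt

# Averaging the modes yields an intermediate trajectory: the implied
# speed is below the "go" mode even though no single mode commands it.
avg = (go + stop) / 2.0
speeds = np.diff(avg) / 0.5  # implied speed between consecutive waypoints
```

Here every implied speed falls below the 4 m/s of the "go" mode, reproducing the longitudinal-averaging effect in miniature.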
Quadtree Generating Networks: Efficient Hierarchical Scene Parsing with Sparse Convolutions
Semantic segmentation with Convolutional Neural Networks is a
memory-intensive task due to the high spatial resolution of feature maps and
output predictions. In this paper, we present Quadtree Generating Networks
(QGNs), a novel approach able to drastically reduce the memory footprint of
modern semantic segmentation networks. The key idea is to use quadtrees to
represent the predictions and target segmentation masks instead of dense pixel
grids. Our quadtree representation enables hierarchical processing of an input
image, with the most computationally demanding layers only being used at
regions in the image containing boundaries between classes. In addition, given
a trained model, our representation enables flexible inference schemes to
trade-off accuracy and computational cost, allowing the network to adapt in
constrained situations such as embedded devices. We demonstrate the benefits of
our approach on the Cityscapes, SUN-RGBD and ADE20k datasets. On Cityscapes, we
obtain a relative 3% mIoU improvement compared to a dilated network with
similar memory consumption; and only receive a 3% relative mIoU drop compared
to a large dilated network, while reducing memory consumption by over a
factor of 4.
Comment: Accepted for IEEE Winter Conference on Applications of Computer Vision (WACV) 2020.
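The core idea of the quadtree representation can be sketched in a few lines: a square label mask is split recursively into quadrants, and any quadrant containing a single class collapses into one leaf, so storage scales with class boundaries rather than pixels. This is a simplified illustration, not the QGN architecture itself; the function names are hypothetical.

```python
import numpy as np

def build_quadtree(mask):
    """Recursively represent a square label mask as a quadtree.
    A leaf is a single class id for a uniform region; an internal node
    holds its four quadrant children. Uniform regions collapse to one
    leaf, so memory tracks class boundaries instead of dense pixels."""
    if (mask == mask.flat[0]).all():
        return int(mask.flat[0])            # uniform block -> leaf
    h = mask.shape[0] // 2
    return [build_quadtree(mask[:h, :h]),   # top-left
            build_quadtree(mask[:h, h:]),   # top-right
            build_quadtree(mask[h:, :h]),   # bottom-left
            build_quadtree(mask[h:, h:])]   # bottom-right

def count_leaves(node):
    """Number of leaves, i.e. stored values, in the quadtree."""
    return 1 if isinstance(node, int) else sum(count_leaves(c) for c in node)
```

For example, an 8x8 mask with one 4x4 region of a second class needs only 4 leaves instead of 64 pixel entries; in a real segmentation mask, leaves concentrate near class boundaries.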
Parting with Misconceptions about Learning-based Vehicle Motion Planning
The release of nuPlan marks a new era in vehicle motion planning research,
offering the first large-scale real-world dataset and evaluation schemes
requiring both precise short-term planning and long-horizon ego-forecasting.
Existing systems struggle to simultaneously meet both requirements. Indeed, we
find that these tasks are fundamentally misaligned and should be addressed
independently. We further assess the current state of closed-loop planning in
the field, revealing the limitations of learning-based methods in complex
real-world scenarios and the value of simple rule-based priors such as
centerline selection through lane graph search algorithms. More surprisingly,
for the open-loop sub-task, we observe that the best results are achieved when
using only this centerline as scene context (i.e., ignoring all information
regarding the map and other agents). Combining these insights, we propose an
extremely simple and efficient planner which outperforms an extensive set of
competitors, winning the nuPlan planning challenge 2023.
Comment: CoRL 2023.
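Centerline selection through lane graph search, the rule-based prior highlighted above, amounts to a shortest-path query over a graph whose nodes are lane segments. A minimal sketch under assumed inputs (the lane-graph dictionary format and function name below are hypothetical, not the paper's API):

```python
import heapq

def shortest_centerline(lane_graph, start, goal):
    """Dijkstra over a lane graph: lane_graph maps a lane-segment id to
    a list of (successor_id, length_m) drivable transitions. Returns the
    sequence of segments forming the selected centerline route, or None
    if the goal is unreachable."""
    pq, seen = [(0.0, start, [start])], set()
    while pq:
        cost, node, path = heapq.heappop(pq)
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt, length in lane_graph.get(node, []):
            if nxt not in seen:
                heapq.heappush(pq, (cost + length, nxt, path + [nxt]))
    return None
```

The returned segment sequence would then be concatenated into the centerline polyline that serves as scene context for the planner.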
End-to-end Autonomous Driving: Challenges and Frontiers
The autonomous driving community has witnessed a rapid growth in approaches
that embrace an end-to-end algorithm framework, utilizing raw sensor input to
generate vehicle motion plans, instead of concentrating on individual tasks
such as detection and motion prediction. End-to-end systems, in comparison to
modular pipelines, benefit from joint feature optimization for perception and
planning. This field has flourished due to the availability of large-scale
datasets, closed-loop evaluation, and the increasing need for autonomous
driving algorithms to perform effectively in challenging scenarios. In this
survey, we provide a comprehensive analysis of more than 250 papers, covering
the motivation, roadmap, methodology, challenges, and future trends in
end-to-end autonomous driving. We delve into several critical challenges,
including multi-modality, interpretability, causal confusion, robustness, and
world models, amongst others. Additionally, we discuss current advancements in
foundation models and visual pre-training, as well as how to incorporate these
techniques within the end-to-end driving framework. To facilitate future
research, we maintain an active repository that contains up-to-date links to
relevant literature and open-source projects at
https://github.com/OpenDriveLab/End-to-end-Autonomous-Driving